
Add JMH benchmarks along with corresponding Rust benchmarks #8597

Open
robert3005 wants to merge 4 commits into develop from mp/jni-bench


@robert3005

Contributor

Add JMH benchmarks for the Java bindings, along with corresponding Rust versions of those
benchmarks.

mprammer and others added 3 commits June 25, 2026 17:31
New `vortex-jni-bench` module (JMH) that stresses the vortex-jni read boundary — JNI plus the
Arrow C Data Interface — which is the path an Iceberg FormatModel takes to read Vortex from the
JVM. Three query shapes (full scan, projection, selective filter) over a synthetic six-column
table, consumed column-at-a-time so the numbers reflect format/boundary cost rather than per-row
JVM allocation. Includes a batch-granularity diagnostic (Vortex coalesces to ~64K-row read
batches regardless of write chunk) and a README with run instructions.
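As a rough illustration of what "consumed column-at-a-time" means, the sketch below folds one column into a checksum in a tight loop, with no per-row object allocation. It is not the benchmark's code: the `ColumnAtATime` class, the `consumeLongColumn` helper, and the use of a plain `long[]` plus `BitSet` in place of an Arrow vector are all assumptions made to keep the example self-contained.

```java
import java.util.BitSet;

/** Hypothetical stand-in for consuming one decoded column batch. */
final class ColumnAtATime {
    /**
     * Folds a whole long column into a checksum.
     * values and validity are plain-Java stand-ins (assumption) for the
     * data and validity buffers of an Arrow vector.
     */
    public static long consumeLongColumn(long[] values, BitSet validity, int rowCount) {
        long checksum = 0;
        for (int i = 0; i < rowCount; i++) {
            // Skip nulls; no boxing and no row object is created here.
            if (validity.get(i)) {
                checksum += values[i];
            }
        }
        return checksum;
    }

    public static void main(String[] args) {
        long[] id = {1, 2, 3, 4};
        BitSet valid = new BitSet(4);
        valid.set(0, 4);   // mark rows 0..3 valid
        valid.clear(2);    // row 2 is null
        System.out.println(consumeLongColumn(id, valid, 4)); // 1 + 2 + 4 = 7
    }
}
```

Returning a checksum gives the caller a value to hand to a JMH `Blackhole` or return from the benchmark method, so the loop cannot be optimized away.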

Must run against a --release native lib (VORTEX_SKIP_MAKE_TEST_FILES=true to preserve it). v2
TODO: a native Rust criterion read of the same file as a floor, to quote boundary overhead vs
native.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: mprammer <martin@spiraldb.com>
…ix units, add guards

A codex-run gauntlet (fresh/correctness/maint) flagged the first cut as overclaiming relative
to what it measured. This commit fixes that:

- Isolate native pushdown from JVM-side work: add projectionControl (full scan, consume id,y
  in Java) and filterControl (full scan, filter cat='alpha' in Java). The pushdown speedup is
  now projection-vs-projectionControl (~4.1x) and selectiveFilter-vs-filterControl (~4.6x),
  not the confounded ~6x-vs-fullScan. (M2)
- fullScan now consumes all six columns at the buffer level (z, cat, tag added), so the
  "all-six-column scan" number is honest (~40M rows/s). (M1)
- @OperationsPerInvocation(ROWS) so JMH reports input rows/s directly, not scans/s. (M3)
- @Setup validates the file before measuring: exact row count, cat='alpha' returns ROWS/|CATS|,
  projection schema is exactly [id,y], so fast garbage can't be cited. (M5)
- Gradle guard fails the jmh task unless VORTEX_SKIP_MAKE_TEST_FILES=true, so a plain run can't
  silently rebuild + measure the debug lib. (M6)
- @Threads(1); tag carries a 10% null rate; README documents the synthetic-data caveats.
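The guards in the list above can be sketched in plain Java. This is a minimal illustration, not the code in this PR: the `BenchGuards` class and its method names are hypothetical, the environment check lives in Gradle in the actual change, and the filter check assumes rows are spread evenly across categories so that ROWS/|CATS| is exact.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helpers mirroring the M3, M5 and M6 guards. */
final class BenchGuards {
    /** M3: the scaling that reporting per input row rather than per scan implies. */
    public static double rowsPerSecond(double scansPerSecond, long rowsPerScan) {
        return scansPerSecond * rowsPerScan;
    }

    /** M5: reject a fixture whose shape does not match what the benchmark claims to measure. */
    public static void validateFixture(long expectedRows, long actualRows, int categoryCount,
                                       long alphaRows, List<String> projectedColumns) {
        if (actualRows != expectedRows) {
            throw new IllegalStateException("row count " + actualRows + " != " + expectedRows);
        }
        // Assumes an even spread of rows over categories.
        long expectedAlpha = expectedRows / categoryCount;
        if (alphaRows != expectedAlpha) {
            throw new IllegalStateException("cat='alpha' rows " + alphaRows + " != " + expectedAlpha);
        }
        if (!projectedColumns.equals(List.of("id", "y"))) {
            throw new IllegalStateException("projection schema is " + projectedColumns);
        }
    }

    /** M6: refuse to measure unless the caller opted out of rebuilding the native lib. */
    public static void requireReleaseLib(Map<String, String> env) {
        if (!"true".equals(env.get("VORTEX_SKIP_MAKE_TEST_FILES"))) {
            throw new IllegalStateException("set VORTEX_SKIP_MAKE_TEST_FILES=true before running");
        }
    }

    public static void main(String[] args) {
        // Illustrative numbers only, not measurements from this PR.
        System.out.println(rowsPerSecond(2.0, 1000L));
        validateFixture(1000L, 1000L, 4, 250L, List.of("id", "y"));
        // A real run would pass System.getenv() instead of this literal map.
        requireReleaseLib(Map.of("VORTEX_SKIP_MAKE_TEST_FILES", "true"));
    }
}
```

Failing with an exception in setup, rather than logging a warning, means a malformed file or a debug build produces no score at all instead of a fast but meaningless one.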

Read path returns string columns as VarCharVector (Utf8), not ViewVarCharVector — matches the
existing TestMinimal read path. Native floor for a boundary-overhead % remains the v2 TODO (M4).

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: mprammer <martin@spiraldb.com>
Signed-off-by: Robert Kruszewski <github@robertk.io>
@robert3005 robert3005 requested a review from a team June 25, 2026 18:21
@codspeed-hq

codspeed-hq Bot commented Jun 25, 2026


Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 5 improved benchmarks
❌ 3 regressed benchmarks
✅ 1581 untouched benchmarks
⏩ 4 skipped benchmarks (see footnote 1)

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | chunked_bool_canonical_into[(1000, 10)] | 15.9 µs | 26.7 µs | -40.29% |
| Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 169.1 µs | 205.8 µs | -17.83% |
| Simulation | slice_empty_vortex | 310 ns | 368.3 ns | -15.84% |
| Simulation | bitwise_not_vortex_buffer_mut[128] | 273.6 ns | 215.3 ns | +27.1% |
| Simulation | bitwise_not_vortex_buffer_mut[1024] | 333.9 ns | 275.6 ns | +21.17% |
| Simulation | bitwise_not_vortex_buffer_mut[2048] | 427.8 ns | 369.4 ns | +15.79% |
| Simulation | chunked_varbinview_canonical_into[(100, 100)] | 259.6 µs | 224.5 µs | +15.65% |
| Simulation | chunked_varbinview_into_canonical[(100, 100)] | 306.8 µs | 271.9 µs | +12.84% |

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing mp/jni-bench (60deb99) with develop (bdbf6c4)


Footnotes

  1. 4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@robert3005 robert3005 added the changelog/chore A trivial change label Jun 25, 2026